Overview


Dataset statistics

Number of variables: 16
Number of observations: 891
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 346.5 KiB
Average record size in memory: 398.2 B

Variable types

Numeric: 6
Categorical: 8
Text: 2

Alerts

Age is highly overall correlated with Age_Group (High correlation)
Age_Group is highly overall correlated with Age (High correlation)
Family_Size is highly overall correlated with Family_Size_Category and 3 other fields (High correlation)
Family_Size_Category is highly overall correlated with Family_Size and 2 other fields (High correlation)
Fare is highly overall correlated with Family_Size and 1 other field (High correlation)
Has_Cabin is highly overall correlated with Fare and 1 other field (High correlation)
Parch is highly overall correlated with Family_Size and 1 other field (High correlation)
Pclass is highly overall correlated with Has_Cabin (High correlation)
Sex is highly overall correlated with Survived and 1 other field (High correlation)
SibSp is highly overall correlated with Family_Size and 1 other field (High correlation)
Survived is highly overall correlated with Sex and 1 other field (High correlation)
Title is highly overall correlated with Sex and 1 other field (High correlation)
PassengerId is uniformly distributed (Uniform)
PassengerId has unique values (Unique)
Name has unique values (Unique)
SibSp has 608 (68.2%) zeros (Zeros)
Parch has 678 (76.1%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)

Reproduction

Analysis started: 2025-10-07 12:54:17.400590
Analysis finished: 2025-10-07 12:54:23.904229
Duration: 6.5 seconds
Software version: ydata-profiling v4.17.0
Download configuration: config.json

Variables

PassengerId
Real number (ℝ)

Uniform  Unique 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 446
Minimum: 1
Maximum: 891
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 45.5
Q1: 223.5
Median: 446
Q3: 668.5
95-th percentile: 846.5
Maximum: 891
Range: 890
Interquartile range (IQR): 445

Descriptive statistics

Standard deviation: 257.35384
Coefficient of variation (CV): 0.57702655
Kurtosis: -1.2
Mean: 446
Median Absolute Deviation (MAD): 223
Skewness: 0
Sum: 397386
Variance: 66231
Monotonicity: Strictly increasing
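
The descriptive statistics for PassengerId can be cross-checked by hand. The following is a minimal sketch, assuming the column is simply the integers 1 through 891 (this is consistent with the Distinct, Minimum, Maximum and Monotonicity rows, but the report does not state it outright). It uses only the Python standard library and suggests that the reported variance follows the sample (n - 1) definition.

```python
import math
import statistics

# Assumption: PassengerId holds every integer from 1 to 891 exactly once.
ids = list(range(1, 892))

mean = statistics.mean(ids)
variance = statistics.variance(ids)  # sample variance, n - 1 denominator
std = math.sqrt(variance)
median = statistics.median(ids)
# Median absolute deviation: median of the distances to the median.
mad = statistics.median(abs(x - median) for x in ids)

summary = {
    "Sum": sum(ids),
    "Mean": mean,
    "Variance": variance,
    "Standard deviation": round(std, 5),
    "Coefficient of variation (CV)": round(std / mean, 8),
    "Median Absolute Deviation (MAD)": mad,
    "Range": max(ids) - min(ids),
}

for label, value in summary.items():
    print(f"{label}: {value}")
```

With the population (n) definition the variance would come out lower than the 66231 shown in the report, which is why the sample definition is the more plausible reading.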
Histogram with fixed size bins (bins=50)

Most frequent values

Value               Count  Frequency (%)
1                   1      0.1%
599                 1      0.1%
588                 1      0.1%
589                 1      0.1%
590                 1      0.1%
591                 1      0.1%
592                 1      0.1%
593                 1      0.1%
594                 1      0.1%
595                 1      0.1%
Other values (881)  881    98.9%

Smallest values

Value  Count  Frequency (%)
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%
5      1      0.1%
6      1      0.1%
7      1      0.1%
8      1      0.1%
9      1      0.1%
10     1      0.1%

Largest values

Value  Count  Frequency (%)
891    1      0.1%
890    1      0.1%
889    1      0.1%
888    1      0.1%
887    1      0.1%
886    1      0.1%
885    1      0.1%
884    1      0.1%
883    1      0.1%
882    1      0.1%

Survived
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB
Category counts: 0 (549), 1 (342)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      549    61.6%
1      342    38.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0549
61.6%
1342
38.4%

Most occurring characters

ValueCountFrequency (%)
0549
61.6%
1342
38.4%

Most occurring categories

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0549
61.6%
1342
38.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0549
61.6%
1342
38.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0549
61.6%
1342
38.4%

Pclass
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB
Category counts: 3 (491), 1 (216), 2 (184)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3      491    55.1%
1      216    24.2%
2      184    20.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
3491
55.1%
1216
24.2%
2184
 
20.7%

Most occurring characters

ValueCountFrequency (%)
3491
55.1%
1216
24.2%
2184
 
20.7%

Most occurring categories

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
3491
55.1%
1216
24.2%
2184
 
20.7%

Most occurring scripts

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
3491
55.1%
1216
24.2%
2184
 
20.7%

Most occurring blocks

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
3491
55.1%
1216
24.2%
2184
 
20.7%

Name
Text

Unique 

Distinct: 891
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 73.2 KiB

Length

Max length: 82
Median length: 52
Mean length: 26.965208
Min length: 12

Characters and Unicode

Total characters: 24026
Distinct characters: 60
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 891
Unique (%): 100.0%

Sample

1st row: Braund, Mr. Owen Harris
2nd row: Cumings, Mrs. John Bradley (Florence Briggs Thayer)
3rd row: Heikkinen, Miss. Laina
4th row: Futrelle, Mrs. Jacques Heath (Lily May Peel)
5th row: Allen, Mr. William Henry

Value                Count  Frequency (%)
mr                   521    14.4%
miss                 182    5.0%
mrs                  129    3.6%
william              64     1.8%
john                 44     1.2%
master               40     1.1%
henry                35     1.0%
george               24     0.7%
james                24     0.7%
charles              23     0.6%
Other values (1515)  2538   70.0%

Most occurring characters

ValueCountFrequency (%)
2735
 
11.4%
r1958
 
8.1%
e1703
 
7.1%
a1657
 
6.9%
i1325
 
5.5%
n1304
 
5.4%
s1297
 
5.4%
M1128
 
4.7%
l1067
 
4.4%
o1008
 
4.2%
Other values (50)8844
36.8%

Most occurring categories

ValueCountFrequency (%)
(unknown)24026
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
2735
 
11.4%
r1958
 
8.1%
e1703
 
7.1%
a1657
 
6.9%
i1325
 
5.5%
n1304
 
5.4%
s1297
 
5.4%
M1128
 
4.7%
l1067
 
4.4%
o1008
 
4.2%
Other values (50)8844
36.8%

Most occurring scripts

ValueCountFrequency (%)
(unknown)24026
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
2735
 
11.4%
r1958
 
8.1%
e1703
 
7.1%
a1657
 
6.9%
i1325
 
5.5%
n1304
 
5.4%
s1297
 
5.4%
M1128
 
4.7%
l1067
 
4.4%
o1008
 
4.2%
Other values (50)8844
36.8%

Most occurring blocks

ValueCountFrequency (%)
(unknown)24026
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
2735
 
11.4%
r1958
 
8.1%
e1703
 
7.1%
a1657
 
6.9%
i1325
 
5.5%
n1304
 
5.4%
s1297
 
5.4%
M1128
 
4.7%
l1067
 
4.4%
o1008
 
4.2%
Other values (50)8844
36.8%

Sex
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 53.8 KiB
Category counts: male (577), female (314)

Length

Max length: 6
Median length: 4
Mean length: 4.704826
Min length: 4

Characters and Unicode

Total characters: 4192
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value   Count  Frequency (%)
male    577    64.8%
female  314    35.2%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
male577
64.8%
female314
35.2%

Most occurring characters

ValueCountFrequency (%)
e1205
28.7%
m891
21.3%
a891
21.3%
l891
21.3%
f314
 
7.5%

Most occurring categories

ValueCountFrequency (%)
(unknown)4192
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
e1205
28.7%
m891
21.3%
a891
21.3%
l891
21.3%
f314
 
7.5%

Most occurring scripts

ValueCountFrequency (%)
(unknown)4192
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
e1205
28.7%
m891
21.3%
a891
21.3%
l891
21.3%
f314
 
7.5%

Most occurring blocks

ValueCountFrequency (%)
(unknown)4192
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
e1205
28.7%
m891
21.3%
a891
21.3%
l891
21.3%
f314
 
7.5%

Age
Real number (ℝ)

High correlation 

Distinct: 88
Distinct (%): 9.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.361582
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 6
Q1: 22
Median: 28
Q3: 35
95-th percentile: 54
Maximum: 80
Range: 79.58
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 13.019697
Coefficient of variation (CV): 0.44342625
Kurtosis: 0.99387102
Mean: 29.361582
Median Absolute Deviation (MAD): 6
Skewness: 0.51024466
Sum: 26161.17
Variance: 169.5125
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value              Count  Frequency (%)
28                 202    22.7%
24                 30     3.4%
22                 27     3.0%
18                 26     2.9%
19                 25     2.8%
30                 25     2.8%
21                 24     2.7%
25                 23     2.6%
36                 22     2.5%
29                 20     2.2%
Other values (78)  467    52.4%

Smallest values

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%
1      7      0.8%
2      10     1.1%
3      6      0.7%
4      10     1.1%
5      4      0.4%

Largest values

Value  Count  Frequency (%)
80     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
66     1      0.1%
65     3      0.3%
64     2      0.2%
63     2      0.2%
62     4      0.4%

SibSp
Real number (ℝ)

High correlation  Zeros 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.52300786
Minimum: 0
Maximum: 8
Zeros: 608
Zeros (%): 68.2%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1027434
Coefficient of variation (CV): 2.1084644
Kurtosis: 17.88042
Mean: 0.52300786
Median Absolute Deviation (MAD): 0
Skewness: 3.6953517
Sum: 466
Variance: 1.2160431
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Most frequent values

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%

Smallest values

Value  Count  Frequency (%)
0      608    68.2%
1      209    23.5%
2      28     3.1%
3      16     1.8%
4      18     2.0%
5      5      0.6%
8      7      0.8%

Largest values

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.1%
1      209    23.5%
0      608    68.2%

Parch
Real number (ℝ)

High correlation  Zeros 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38159371
Minimum: 0
Maximum: 6
Zeros: 678
Zeros (%): 76.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80605722
Coefficient of variation (CV): 2.1123441
Kurtosis: 9.7781252
Mean: 0.38159371
Median Absolute Deviation (MAD): 0
Skewness: 2.749117
Sum: 340
Variance: 0.64972824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Most frequent values

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
5      5      0.6%
3      5      0.6%
4      4      0.4%
6      1      0.1%

Smallest values

Value  Count  Frequency (%)
0      678    76.1%
1      118    13.2%
2      80     9.0%
3      5      0.6%
4      4      0.4%
5      5      0.6%
6      1      0.1%

Largest values

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.4%
3      5      0.6%
2      80     9.0%
1      118    13.2%
0      678    76.1%

Ticket
Text

Distinct: 681
Distinct (%): 76.4%
Missing: 0
Missing (%): 0.0%
Memory size: 55.6 KiB

Length

Max length: 18
Median length: 17
Mean length: 6.7508418
Min length: 3

Characters and Unicode

Total characters: 6015
Distinct characters: 35
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 547
Unique (%): 61.4%

Sample

1st row: A/5 21171
2nd row: PC 17599
3rd row: STON/O2. 3101282
4th row: 113803
5th row: 373450

Value               Count  Frequency (%)
pc                  60     5.3%
c.a                 27     2.4%
a/5                 17     1.5%
ca                  14     1.2%
ston/o              12     1.1%
2                   12     1.1%
sc/paris            9      0.8%
w./c                9      0.8%
soton/o.q           8      0.7%
347082              7      0.6%
Other values (709)  955    84.5%

Most occurring characters

ValueCountFrequency (%)
3746
12.4%
1689
11.5%
2594
9.9%
7490
8.1%
4464
 
7.7%
6422
 
7.0%
0406
 
6.7%
5387
 
6.4%
9328
 
5.5%
8282
 
4.7%
Other values (25)1207
20.1%

Most occurring categories

ValueCountFrequency (%)
(unknown)6015
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
3746
12.4%
1689
11.5%
2594
9.9%
7490
8.1%
4464
 
7.7%
6422
 
7.0%
0406
 
6.7%
5387
 
6.4%
9328
 
5.5%
8282
 
4.7%
Other values (25)1207
20.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown)6015
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
3746
12.4%
1689
11.5%
2594
9.9%
7490
8.1%
4464
 
7.7%
6422
 
7.0%
0406
 
6.7%
5387
 
6.4%
9328
 
5.5%
8282
 
4.7%
Other values (25)1207
20.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown)6015
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
3746
12.4%
1689
11.5%
2594
9.9%
7490
8.1%
4464
 
7.7%
6422
 
7.0%
0406
 
6.7%
5387
 
6.4%
9328
 
5.5%
8282
 
4.7%
Other values (25)1207
20.1%

Fare
Real number (ℝ)

High correlation  Zeros 

Distinct: 248
Distinct (%): 27.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.204208
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.9104
Median: 14.4542
Q3: 31
95-th percentile: 112.07915
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.0896

Descriptive statistics

Standard deviation: 49.693429
Coefficient of variation (CV): 1.5430725
Kurtosis: 33.398141
Mean: 32.204208
Median Absolute Deviation (MAD): 6.9042
Skewness: 4.7873165
Sum: 28693.949
Variance: 2469.4368
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Most frequent values

Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              38     4.3%
7.75                34     3.8%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
7.2292              15     1.7%
0                   15     1.7%
Other values (238)  615    69.0%

Smallest values

Value   Count  Frequency (%)
0       15     1.7%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  1      0.1%
6.45    1      0.1%
6.4958  2      0.2%
6.75    2      0.2%
6.8583  1      0.1%
6.95    1      0.1%

Largest values

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.4%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.4%
221.7792  1      0.1%
211.5     1      0.1%
211.3375  3      0.3%
164.8667  2      0.2%
153.4625  3      0.3%

Embarked
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB
Category counts: S (646), C (168), Q (77)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: S
2nd row: C
3rd row: S
4th row: S
5th row: S

Common Values

Value  Count  Frequency (%)
S      646    72.5%
C      168    18.9%
Q      77     8.6%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
s646
72.5%
c168
 
18.9%
q77
 
8.6%

Most occurring characters

ValueCountFrequency (%)
S646
72.5%
C168
 
18.9%
Q77
 
8.6%

Most occurring categories

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
S646
72.5%
C168
 
18.9%
Q77
 
8.6%

Most occurring scripts

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
S646
72.5%
C168
 
18.9%
Q77
 
8.6%

Most occurring blocks

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
S646
72.5%
C168
 
18.9%
Q77
 
8.6%

Has_Cabin
Categorical

High correlation 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.6 KiB
Category counts: 0 (687), 1 (204)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 891
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      687    77.1%
1      204    22.9%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0687
77.1%
1204
 
22.9%

Most occurring characters

ValueCountFrequency (%)
0687
77.1%
1204
 
22.9%

Most occurring categories

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
0687
77.1%
1204
 
22.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
0687
77.1%
1204
 
22.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown)891
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
0687
77.1%
1204
 
22.9%

Age_Group
Categorical

High correlation 

Distinct: 5
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 KiB
Category counts: Adult (535), Middle-aged (195), Teen (70), Child (69), Senior (22)

Length

Max length: 11
Median length: 5
Mean length: 6.2592593
Min length: 4

Characters and Unicode

Total characters: 5577
Distinct characters: 18
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Adult
2nd row: Middle-aged
3rd row: Adult
4th row: Adult
5th row: Adult

Common Values

Value        Count  Frequency (%)
Adult        535    60.0%
Middle-aged  195    21.9%
Teen         70     7.9%
Child        69     7.7%
Senior       22     2.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
adult535
60.0%
middle-aged195
 
21.9%
teen70
 
7.9%
child69
 
7.7%
senior22
 
2.5%

Most occurring characters

ValueCountFrequency (%)
d1189
21.3%
l799
14.3%
e552
9.9%
A535
9.6%
u535
9.6%
t535
9.6%
i286
 
5.1%
g195
 
3.5%
a195
 
3.5%
-195
 
3.5%
Other values (8)561
10.1%

Most occurring categories

ValueCountFrequency (%)
(unknown)5577
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
d1189
21.3%
l799
14.3%
e552
9.9%
A535
9.6%
u535
9.6%
t535
9.6%
i286
 
5.1%
g195
 
3.5%
a195
 
3.5%
-195
 
3.5%
Other values (8)561
10.1%

Most occurring scripts

ValueCountFrequency (%)
(unknown)5577
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
d1189
21.3%
l799
14.3%
e552
9.9%
A535
9.6%
u535
9.6%
t535
9.6%
i286
 
5.1%
g195
 
3.5%
a195
 
3.5%
-195
 
3.5%
Other values (8)561
10.1%

Most occurring blocks

ValueCountFrequency (%)
(unknown)5577
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
d1189
21.3%
l799
14.3%
e552
9.9%
A535
9.6%
u535
9.6%
t535
9.6%
i286
 
5.1%
g195
 
3.5%
a195
 
3.5%
-195
 
3.5%
Other values (8)561
10.1%

Title
Categorical

High correlation 

Distinct: 5
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 52.1 KiB
Category counts: Mr (517), Miss (185), Mrs (126), Master (40), Rare (23)

Length

Max length: 6
Median length: 2
Mean length: 2.7878788
Min length: 2

Characters and Unicode

Total characters: 2484
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mr
2nd row: Mrs
3rd row: Miss
4th row: Mrs
5th row: Mr

Common Values

Value   Count  Frequency (%)
Mr      517    58.0%
Miss    185    20.8%
Mrs     126    14.1%
Master  40     4.5%
Rare    23     2.6%
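
The report does not say how Title was produced, but its five values look like honorifics pulled out of Name and then grouped. The sketch below is one plausible way to do that, not the dataset author's actual code: the regular expression, the `ALIASES` mapping and the function name `extract_title` are all assumptions made for illustration.

```python
import re

# Hypothetical grouping rules; the report does not show the real ones.
COMMON_TITLES = {"Mr", "Miss", "Mrs", "Master"}
ALIASES = {"Mlle": "Miss", "Ms": "Miss", "Mme": "Mrs"}  # assumed spelling variants


def extract_title(name: str) -> str:
    """Return the grouped honorific found between the comma and the next period."""
    match = re.search(r",\s*([^.,]+)\.", name)
    if match is None:
        return "Rare"
    title = match.group(1).strip()
    title = ALIASES.get(title, title)
    # Anything outside the four common titles is folded into "Rare".
    return title if title in COMMON_TITLES else "Rare"


# Names taken from the Sample section of this report.
for name in [
    "Braund, Mr. Owen Harris",
    "Heikkinen, Miss. Laina",
    "Palsson, Master. Gosta Leonard",
    "Montvila, Rev. Juozas",
]:
    print(name, "->", extract_title(name))
```

On the sample rows shown in this report, the sketch agrees with the Title column, including "Rare" for the passenger titled "Rev.".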

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
mr517
58.0%
miss185
 
20.8%
mrs126
 
14.1%
master40
 
4.5%
rare23
 
2.6%

Most occurring characters

ValueCountFrequency (%)
M868
34.9%
r706
28.4%
s536
21.6%
i185
 
7.4%
a63
 
2.5%
e63
 
2.5%
t40
 
1.6%
R23
 
0.9%

Most occurring categories

ValueCountFrequency (%)
(unknown)2484
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
M868
34.9%
r706
28.4%
s536
21.6%
i185
 
7.4%
a63
 
2.5%
e63
 
2.5%
t40
 
1.6%
R23
 
0.9%

Most occurring scripts

ValueCountFrequency (%)
(unknown)2484
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
M868
34.9%
r706
28.4%
s536
21.6%
i185
 
7.4%
a63
 
2.5%
e63
 
2.5%
t40
 
1.6%
R23
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
(unknown)2484
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
M868
34.9%
r706
28.4%
s536
21.6%
i185
 
7.4%
a63
 
2.5%
e63
 
2.5%
t40
 
1.6%
R23
 
0.9%

Family_Size
Real number (ℝ)

High correlation 

Distinct: 9
Distinct (%): 1.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.9046016
Minimum: 1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 2
95-th percentile: 6
Maximum: 11
Range: 10
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.6134585
Coefficient of variation (CV): 0.84713704
Kurtosis: 9.159666
Mean: 1.9046016
Median Absolute Deviation (MAD): 0
Skewness: 2.7274415
Sum: 1697
Variance: 2.6032485
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=9)

Most frequent values

Value  Count  Frequency (%)
1      537    60.3%
2      161    18.1%
3      102    11.4%
4      29     3.3%
6      22     2.5%
5      15     1.7%
7      12     1.3%
11     7      0.8%
8      6      0.7%

Smallest values

Value  Count  Frequency (%)
1      537    60.3%
2      161    18.1%
3      102    11.4%
4      29     3.3%
5      15     1.7%
6      22     2.5%
7      12     1.3%
8      6      0.7%
11     7      0.8%

Largest values

Value  Count  Frequency (%)
11     7      0.8%
8      6      0.7%
7      12     1.3%
6      22     2.5%
5      15     1.7%
4      29     3.3%
3      102    11.4%
2      161    18.1%
1      537    60.3%

Family_Size_Category
Categorical

High correlation 

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 KiB
Category counts: Alone (537), Small (292), Large (62)

Length

Max length: 5
Median length: 5
Mean length: 5
Min length: 5

Characters and Unicode

Total characters: 4455
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Small
2nd row: Small
3rd row: Alone
4th row: Small
5th row: Alone

Common Values

Value  Count  Frequency (%)
Alone  537    60.3%
Small  292    32.8%
Large  62     7.0%
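
The report does not document how Family_Size and Family_Size_Category were derived. The figures are consistent with Family_Size = SibSp + Parch + 1 (the sums 466 + 340 + 891 give the reported 1697) and with category cut-offs at 1, 2 to 4, and 5 or more. The sketch below checks that reading against the value counts listed in this report; the function names and thresholds are assumptions inferred from those counts, not the author's confirmed rules.

```python
def family_size(sibsp: int, parch: int) -> int:
    """Assumed rule: the passenger plus siblings/spouses plus parents/children."""
    return sibsp + parch + 1


def family_size_category(size: int) -> str:
    # Assumed thresholds, inferred from the counts in this report.
    if size == 1:
        return "Alone"
    if size <= 4:
        return "Small"
    return "Large"


# Family_Size value counts as listed in this report.
size_counts = {1: 537, 2: 161, 3: 102, 4: 29, 5: 15, 6: 22, 7: 12, 8: 6, 11: 7}

# Aggregate the per-size counts into the three assumed categories.
category_counts = {}
for size, count in size_counts.items():
    label = family_size_category(size)
    category_counts[label] = category_counts.get(label, 0) + count

print(category_counts)
```

Under these assumptions the aggregated counts match the Common Values table for Family_Size_Category (Alone 537, Small 292, Large 62).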

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
alone537
60.3%
small292
32.8%
large62
 
7.0%

Most occurring characters

ValueCountFrequency (%)
l1121
25.2%
e599
13.4%
A537
12.1%
o537
12.1%
n537
12.1%
a354
 
7.9%
S292
 
6.6%
m292
 
6.6%
L62
 
1.4%
r62
 
1.4%

Most occurring categories

ValueCountFrequency (%)
(unknown)4455
100.0%

Most frequent character per category

(unknown)
ValueCountFrequency (%)
l1121
25.2%
e599
13.4%
A537
12.1%
o537
12.1%
n537
12.1%
a354
 
7.9%
S292
 
6.6%
m292
 
6.6%
L62
 
1.4%
r62
 
1.4%

Most occurring scripts

ValueCountFrequency (%)
(unknown)4455
100.0%

Most frequent character per script

(unknown)
ValueCountFrequency (%)
l1121
25.2%
e599
13.4%
A537
12.1%
o537
12.1%
n537
12.1%
a354
 
7.9%
S292
 
6.6%
m292
 
6.6%
L62
 
1.4%
r62
 
1.4%

Most occurring blocks

ValueCountFrequency (%)
(unknown)4455
100.0%

Most frequent character per block

(unknown)
ValueCountFrequency (%)
l1121
25.2%
e599
13.4%
A537
12.1%
o537
12.1%
n537
12.1%
a354
 
7.9%
S292
 
6.6%
m292
 
6.6%
L62
 
1.4%
r62
 
1.4%

Interactions

[Figures: 36 pairwise interaction plots; images not reproduced.]

Correlations

                      Age     Age_Group  Embarked  Family_Size  Family_Size_Category  Fare    Has_Cabin  Parch   PassengerId  Pclass  Sex     SibSp   Survived  Title
Age                   1.000   0.819      0.151    -0.183        0.293                 0.126   0.278     -0.217   0.035        0.265   0.106  -0.145   0.158     0.365
Age_Group             0.819   1.000      0.077     0.263        0.317                 0.089   0.275      0.278   0.000        0.241   0.127   0.257   0.118     0.383
Embarked              0.151   0.077      1.000     0.083        0.128                 0.195   0.226      0.052   0.000        0.258   0.111   0.092   0.164     0.130
Family_Size          -0.183   0.263      0.083     1.000        0.816                 0.529   0.070      0.801  -0.050        0.137   0.205   0.849   0.215     0.252
Family_Size_Category  0.293   0.317      0.128     0.816        1.000                 0.307   0.211      0.607   0.035        0.185   0.300   0.811   0.285     0.402
Fare                  0.126   0.089      0.195     0.529        0.307                 1.000   0.582      0.410  -0.014        0.479   0.189   0.447   0.283     0.097
Has_Cabin             0.278   0.275      0.226     0.070        0.211                 0.582   1.000      0.091   0.063        0.790   0.134   0.138   0.313     0.158
Parch                -0.217   0.278      0.052     0.801        0.607                 0.410   0.091      1.000   0.001        0.022   0.247   0.450   0.157     0.269
PassengerId           0.035   0.000      0.000    -0.050        0.035                -0.014   0.063      0.001   1.000        0.032   0.066  -0.061   0.104     0.040
Pclass                0.265   0.241      0.258     0.137        0.185                 0.479   0.790      0.022   0.032        1.000   0.130   0.148   0.337     0.189
Sex                   0.106   0.127      0.111     0.205        0.300                 0.189   0.134      0.247   0.066        0.130   1.000   0.206   0.540     0.992
SibSp                -0.145   0.257      0.092     0.849        0.811                 0.447   0.138      0.450  -0.061        0.148   0.206   1.000   0.187     0.294
Survived              0.158   0.118      0.164     0.215        0.285                 0.283   0.313      0.157   0.104        0.337   0.540   0.187   1.000     0.565
Title                 0.365   0.383      0.130     0.252        0.402                 0.097   0.158      0.269   0.040        0.189   0.992   0.294   0.565     1.000
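
The "High correlation" alerts in the Overview can be traced back to this matrix. The sketch below copies the matrix values from the report and lists, for each variable, the other variables whose absolute coefficient reaches a cut-off. The cut-off of 0.5 is an assumption: it is not stated anywhere in the report, but it happens to reproduce the twelve alerts listed there. The function name `high_correlations` is likewise illustrative.

```python
NAMES = [
    "Age", "Age_Group", "Embarked", "Family_Size", "Family_Size_Category",
    "Fare", "Has_Cabin", "Parch", "PassengerId", "Pclass", "Sex", "SibSp",
    "Survived", "Title",
]

# Correlation matrix copied from the report, one row per name in NAMES.
MATRIX = [
    [1.000, 0.819, 0.151, -0.183, 0.293, 0.126, 0.278, -0.217, 0.035, 0.265, 0.106, -0.145, 0.158, 0.365],
    [0.819, 1.000, 0.077, 0.263, 0.317, 0.089, 0.275, 0.278, 0.000, 0.241, 0.127, 0.257, 0.118, 0.383],
    [0.151, 0.077, 1.000, 0.083, 0.128, 0.195, 0.226, 0.052, 0.000, 0.258, 0.111, 0.092, 0.164, 0.130],
    [-0.183, 0.263, 0.083, 1.000, 0.816, 0.529, 0.070, 0.801, -0.050, 0.137, 0.205, 0.849, 0.215, 0.252],
    [0.293, 0.317, 0.128, 0.816, 1.000, 0.307, 0.211, 0.607, 0.035, 0.185, 0.300, 0.811, 0.285, 0.402],
    [0.126, 0.089, 0.195, 0.529, 0.307, 1.000, 0.582, 0.410, -0.014, 0.479, 0.189, 0.447, 0.283, 0.097],
    [0.278, 0.275, 0.226, 0.070, 0.211, 0.582, 1.000, 0.091, 0.063, 0.790, 0.134, 0.138, 0.313, 0.158],
    [-0.217, 0.278, 0.052, 0.801, 0.607, 0.410, 0.091, 1.000, 0.001, 0.022, 0.247, 0.450, 0.157, 0.269],
    [0.035, 0.000, 0.000, -0.050, 0.035, -0.014, 0.063, 0.001, 1.000, 0.032, 0.066, -0.061, 0.104, 0.040],
    [0.265, 0.241, 0.258, 0.137, 0.185, 0.479, 0.790, 0.022, 0.032, 1.000, 0.130, 0.148, 0.337, 0.189],
    [0.106, 0.127, 0.111, 0.205, 0.300, 0.189, 0.134, 0.247, 0.066, 0.130, 1.000, 0.206, 0.540, 0.992],
    [-0.145, 0.257, 0.092, 0.849, 0.811, 0.447, 0.138, 0.450, -0.061, 0.148, 0.206, 1.000, 0.187, 0.294],
    [0.158, 0.118, 0.164, 0.215, 0.285, 0.283, 0.313, 0.157, 0.104, 0.337, 0.540, 0.187, 1.000, 0.565],
    [0.365, 0.383, 0.130, 0.252, 0.402, 0.097, 0.158, 0.269, 0.040, 0.189, 0.992, 0.294, 0.565, 1.000],
]

THRESHOLD = 0.5  # assumed cut-off, not stated in the report


def high_correlations(names, matrix, threshold):
    """Map each variable to the other variables with |coefficient| >= threshold."""
    result = {}
    for i, row_name in enumerate(names):
        partners = [
            names[j]
            for j, value in enumerate(matrix[i])
            if j != i and abs(value) >= threshold
        ]
        if partners:
            result[row_name] = partners
    return result


flagged = high_correlations(NAMES, MATRIX, THRESHOLD)
for variable, partners in flagged.items():
    print(f"{variable}: {', '.join(partners)}")
```

Embarked and PassengerId have no coefficient at or above the assumed cut-off, which matches their absence from the correlation alerts.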

Missing values

[Figure: nullity by column] A simple visualization of nullity by column.
[Figure: nullity matrix] Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

     PassengerId  Survived  Pclass  Name                                                 Sex     Age   SibSp  Parch  Ticket            Fare     Embarked  Has_Cabin  Age_Group    Title   Family_Size  Family_Size_Category
0    1            0         3       Braund, Mr. Owen Harris                              male    22.0  1      0      A/5 21171         7.2500   S         0          Adult        Mr      2            Small
1    2            1         1       Cumings, Mrs. John Bradley (Florence Briggs Thayer)  female  38.0  1      0      PC 17599          71.2833  C         1          Middle-aged  Mrs     2            Small
2    3            1         3       Heikkinen, Miss. Laina                               female  26.0  0      0      STON/O2. 3101282  7.9250   S         0          Adult        Miss    1            Alone
3    4            1         1       Futrelle, Mrs. Jacques Heath (Lily May Peel)         female  35.0  1      0      113803            53.1000  S         1          Adult        Mrs     2            Small
4    5            0         3       Allen, Mr. William Henry                             male    35.0  0      0      373450            8.0500   S         0          Adult        Mr      1            Alone
5    6            0         3       Moran, Mr. James                                     male    28.0  0      0      330877            8.4583   Q         0          Adult        Mr      1            Alone
6    7            0         1       McCarthy, Mr. Timothy J                              male    54.0  0      0      17463             51.8625  S         1          Middle-aged  Mr      1            Alone
7    8            0         3       Palsson, Master. Gosta Leonard                       male    2.0   3      1      349909            21.0750  S         0          Child        Master  5            Large
8    9            1         3       Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)    female  27.0  0      2      347742            11.1333  S         0          Adult        Mrs     3            Small
9    10           1         2       Nasser, Mrs. Nicholas (Adele Achem)                  female  14.0  1      0      237736            30.0708  C         0          Teen         Mrs     2            Small

Last rows

     PassengerId  Survived  Pclass  Name                                                 Sex     Age   SibSp  Parch  Ticket            Fare     Embarked  Has_Cabin  Age_Group    Title   Family_Size  Family_Size_Category
881  882          0         3       Markun, Mr. Johann                                   male    33.0  0      0      349257            7.8958   S         0          Adult        Mr      1            Alone
882  883          0         3       Dahlberg, Miss. Gerda Ulrika                         female  22.0  0      0      7552              10.5167  S         0          Adult        Miss    1            Alone
883  884          0         2       Banfield, Mr. Frederick James                        male    28.0  0      0      C.A./SOTON 34068  10.5000  S         0          Adult        Mr      1            Alone
884  885          0         3       Sutehall, Mr. Henry Jr                               male    25.0  0      0      SOTON/OQ 392076   7.0500   S         0          Adult        Mr      1            Alone
885  886          0         3       Rice, Mrs. William (Margaret Norton)                 female  39.0  0      5      382652            29.1250  Q         0          Middle-aged  Mrs     6            Large
886  887          0         2       Montvila, Rev. Juozas                                male    27.0  0      0      211536            13.0000  S         0          Adult        Rare    1            Alone
887  888          1         1       Graham, Miss. Margaret Edith                         female  19.0  0      0      112053            30.0000  S         1          Adult        Miss    1            Alone
888  889          0         3       Johnston, Miss. Catherine Helen "Carrie"             female  28.0  1      2      W./C. 6607        23.4500  S         0          Adult        Miss    4            Small
889  890          1         1       Behr, Mr. Karl Howell                                male    26.0  0      0      111369            30.0000  C         1          Adult        Mr      1            Alone
890  891          0         3       Dooley, Mr. Patrick                                  male    32.0  0      0      370376            7.7500   Q         0          Adult        Mr      1            Alone